-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement resource budgets in dagcbor parsing. #85
Conversation
) | ||
|
||
const ( | ||
mapEntryGasScore = 8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how are you deriving these? is it a rough approximation of the overhead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep.
Fairly arbitrarily derived. I figure it takes -- very, very roughly -- at least this many bytes in memory to add another map entry to memory.
Is this number wrong? Absolutely. I have no idea how many bytes it takes to add a map entry in memory. It's a property of the golang native map implementation, and varies based on the size of the map, which is not something I'm particularly interested in predicting or making specific claims about... and then, even that is only in some Nodes. It's actually zero in some nodes (e.g. codegen'd structs). Or who knows what in some other Node implementation: it's an interface users can supply their own of, after all.
It's sufficient to have a limit; I'm not sure it's necessarily important for the limit to be easy to intuit.
sgtm, we'd just better make sure that all instances of allocation, especially large allocation, are accounted for into the future! it's going to be easy to overlook this. Also, "gas" used like this really is an Americanism. Not a major objection because it's for internal use, just a note that non-americans have to do a double cognitive jump to get to what it's trying to mean when we read "gas" (I have the same problem with Filecoin; it goes like this: "gas? ohhh, they're Americans, there's a fuel tank involved with a fixed capacity, ok"). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally SGTM, though I would like more docs around the gas scores like Rod says.
If we ever expose this as an option, I do think it should be pretty much around byte sizes, because the user shouldn't have to learn what a made up term like "allocation gas" means.
FWIW, I'm mostly pulling "gas" from Ethereum. It's the first digital project I'm aware of that coined a term for this concept and made it popularized. I'm pretty sure my automata course in college introduced the same core concept right around where it discussed solutions to the halting problem, but whatever the heck it was called in that literature, it sure wasn't memorable. "Gas" is. I agree with the need for more documentation around this. I don't actually know how to reduce it directly to byte sizes though, and I'm not even sure that's possible, due to all the variations that are potentially involved, discussed in #85 (comment) . |
I'm going to merge this for now because it's been out long enough and I'm pretty sure it's better than things were before. More PRs in the future to improve docs and configurability will be highly welcome. We'll also probably want to port this to other codecs -- either in shared code if possible, or at least in shared pattern. |
This is an alternative fix for #82 .
This diff keeps a single resource budget throughout the entire parse and unmarshal process, decrementing it as resources are consumed. Where possible, it also checks if a length header suggests the resource limit will be exhausted within the next stretch.
This does not yet include a configuration mechanism, but I've set it to "around 10 megs" for now, which is probably a reasonable safe number to roll out today as a default (considering that in many projects that use IPLD, libp2p is also in play, which sets much lower limits for serial message size already). I'd certainly like to make it properly configurable, but perhaps that can be explored later.